In-Place versus Re-Build versus Re-Merge: Index Maintenance Strategies for Text Retrieval Systems
نویسندگان
چکیده
Indexes are the key technology underpinning efficient text search. A range of algorithms have been developed for fast query evaluation and for index creation, but update algorithms for high-performance indexes have not been evaluated or even fully described. In this paper, we explore the three main alternative strategies for index update: in-place update, index merging, and complete re-build. Our experiments with large volumes of web data show that re-merge is for large numbers of updates the fastest approach, but in-place update is suitable when the rate of update is low or buffer size is limited.
منابع مشابه
Efficient online index maintenance for contiguous inverted lists q
Search engines and other text retrieval systems use high-performance inverted indexes to provide efficient text query evaluation. Algorithms for fast query evaluation and index construction are well-known, but relatively little has been published concerning update. In this paper, we experimentally evaluate the two main alternative strategies for index maintenance in the presence of insertions, ...
متن کاملEfficient online index maintenance for contiguous inverted lists q Nicholas Lester
Search engines and other text retrieval systems use high-performance inverted indexes to provide efficient text query evaluation. Algorithms for fast query evaluation and index construction are well-known, but relatively little has been published concerning update. In this paper, we experimentally evaluate the two main alternative strategies for index maintenance in the presence of insertions, ...
متن کاملEfficient online index maintenance for contiguous inverted lists
Search engines and other text retrieval systems use high-performance inverted indexes to provide efficient text query evaluation. Algorithms for fast query evaluation and index construction are well-known, but relatively little has been published concerning update. In this paper, we experimentally evaluate the two main alternative strategies for index maintenance in the presence of insertions, ...
متن کاملA Hybrid Approach to Index Maintenance in Dynamic Text Retrieval Systems
In-place and merge-based index maintenance are the two main competing strategies for on-line index construction in dynamic information retrieval systems based on inverted lists. Motivated by recent results for both strategies, we investigate possible combinations of in-place and merge-based index maintenance. We present a hybrid approach in which long posting lists are updated in-place, while s...
متن کاملInverted index maintenance strategy for flashSSDs: Revitalization of in-place index update strategy
An inverted index is a core data structure of Information Retrieval systems, especially in search engines. Since the search environments have become more dynamic, many on-line index maintenance strategies have been proposed. Previous strategies were designed for HDDs. Consequently, in order to avoid expensive random access cost, Merge-based strategies have been preferred to In-place index updat...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2004